Matrix multiplication algorithm


$XX^{t}$ Can Be Faster

Rybin, Dmitry, Zhang, Yushun, Luo, Zhi-Quan

arXiv.org Artificial Intelligence

We present RXTX, a new algorithm for computing the product of a matrix with its transpose, $XX^{t}$, for $X\in \mathbb{R}^{n\times m}$. RXTX uses $5\%$ fewer multiplications and $5\%$ fewer operations (additions and multiplications) than state-of-the-art algorithms. Notably, the acceleration holds not only asymptotically for large matrices with $n \rightarrow \infty$, but also for small matrices, including $n = 4$. The algorithm was discovered by combining machine-learning-based search methods with combinatorial optimization.
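As a point of reference for what RXTX improves on, here is a minimal sketch (not the RXTX scheme itself, which the paper specifies) of computing $XX^{t}$ while exploiting the symmetry of the result, so only the upper triangle is computed directly:

```python
import numpy as np

def gram_naive(X):
    """Compute X @ X.T directly, exploiting symmetry:
    only the upper triangle is computed, then mirrored."""
    n = X.shape[0]
    G = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            G[i, j] = X[i] @ X[j]   # dot product of rows i and j
            G[j, i] = G[i, j]       # XX^t is symmetric
    return G

X = np.random.default_rng(0).normal(size=(4, 6))
assert np.allclose(gram_naive(X), X @ X.T)
```

Symmetry alone roughly halves the scalar work relative to a general matrix product; RXTX's reported 5% savings are on top of specialized algorithms of this kind.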


Improving Parallel Program Performance Through DSL-Driven Code Generation with LLM Optimizers

Wei, Anjiang, Nie, Allen, Teixeira, Thiago S. F. X., Yadav, Rohan, Lee, Wonchan, Wang, Ke, Aiken, Alex

arXiv.org Artificial Intelligence

Mapping computations to processors and assigning data to memory are critical for maximizing performance in parallel programming. These mapping decisions are managed through the development of specialized low-level system code, called mappers, crafted by performance engineers. Each mapper is tailored to a specific application and optimized for the underlying machine architecture, a process that requires days of refinement and tuning from an expert. Despite advances in system research, automating mapper generation remains a challenge due to the complexity of making millions of decisions to find the optimal solution and generate the solution as code. We introduce an approach that leverages recent advances in LLM-based optimizers for mapper design. In under ten minutes, our method automatically discovers mappers that surpass human expert designs in scientific applications by up to 1.34X speedup. For parallel matrix multiplication algorithms, our mapper achieves up to 1.31X of the expert-designed solution. To achieve this, we simplify the complexity of low-level code generation by introducing a domain-specific language (DSL) that abstracts the low-level system programming details and defines a structured search space for LLMs to explore. To maximize the application performance, we use an LLM optimizer to improve an agentic system that generates the mapper code. As a result, this approach significantly reduces the workload for performance engineers while achieving substantial performance gains across diverse applications. Finally, our results demonstrate the effectiveness of LLM-based optimization in system design and suggest its potential for addressing other complex system challenges.


OpenTensor: Reproducing Faster Matrix Multiplication Discovering Algorithms

Sun, Yiwen, Li, Wenye

arXiv.org Artificial Intelligence

Matrix multiplication (MM) is a fundamental numerical operation used everywhere. To search for faster MM algorithms, DeepMind proposed AlphaTensor [1], based on AlphaZero [3], and constructed a Monte Carlo Tree Search (MCTS) architecture. AlphaTensor [1] not only finds faster algorithms for matrix multiplication but also provides a new paradigm for using machine learning to solve scientific problems. However, due to the lack of open-source code and the many algorithmic tricks involved, researchers may get lost in the myriad of details and find it hard to understand the key points, let alone reproduce the performance and apply the method to other problems. In this paper, we reproduce AlphaTensor [1] and hope that our work will help others fully understand this scientific problem-solving paradigm.


HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption

Lee, Seewoo, Lee, Garam, Kim, Jung Woo, Shin, Junbum, Lee, Mun-Kyu

arXiv.org Artificial Intelligence

Transfer learning is a de facto standard method for efficiently training machine learning models on data-scarce problems by adding and fine-tuning new classification layers on a model pre-trained on large datasets. Although numerous previous studies have proposed using homomorphic encryption to resolve the data privacy issue in transfer learning in the machine-learning-as-a-service setting, most of them focused only on encrypted inference. In this study, we present HETAL, an efficient Homomorphic Encryption-based Transfer Learning algorithm that protects the client's privacy in training tasks by encrypting the client data using the CKKS homomorphic encryption scheme. HETAL is the first practical scheme that strictly provides encrypted training, adopting validation-based early stopping and achieving the accuracy of non-encrypted training. We propose an efficient encrypted matrix multiplication algorithm, which is 1.8 to 323 times faster than prior methods, and a highly precise softmax approximation algorithm with increased coverage. The experimental results for five well-known benchmark datasets show total training times of 567-3442 seconds, which is less than an hour.


Better Algorithms through Faster Math

Communications of the ACM

Developing faster algorithms is an important but elusive goal for data scientists. The ability to accelerate complex computing tasks and reduce latency has far-reaching ramifications in areas such as natural language processing, video streaming, autonomous robotics, gaming, and extended reality. Yet for all the hype surrounding computer algorithms and the increasingly sophisticated ways they operate, a basic fact stands out: these algorithms are typically built atop matrix multiplication, a basic operation of linear algebra. The underlying mathematical framework has not changed a great deal since the inception of computing--and finding more efficient formulas has proved difficult. It is an issue attracting growing attention--particularly as machine learning (ML), deep learning (DL), artificial intelligence (AI), and machine automation advance into the mainstream.


DeepMind AlphaTensor: The delicate balance between human and artificial intelligence

#artificialintelligence

This article is part of our coverage of the latest in AI research. DeepMind has made another impressive artificial intelligence announcement with AlphaTensor, a deep reinforcement learning system that discovers algorithms to make matrix multiplications much more efficient. Matrix multiplication is at the heart of many computational tasks, including neural networks, 3D graphics, and data compression. Therefore, there are many immediate applications for an AI system that can improve the efficiency of matrix multiplication. To create AlphaTensor, scientists at DeepMind used AlphaZero, the deep learning system that previously mastered board games like go, chess, and shogi.


DeepMind's AlphaTensor: The AI That Is Reinventing Math

#artificialintelligence

Originally published on Towards AI, the world's leading AI and technology news and media company. Without realizing it, many of our everyday activities involve matrix multiplications in one way or another.


How DeepMind's AlphaTensor AI Devised a Faster Matrix Multiplication

#artificialintelligence

After developing an artificial intelligence that can achieve superhuman mastery of games like chess and Go, in addition to another AI that can predict how proteins fold themselves in three-dimensional space, the researchers over at DeepMind have done it again -- this time using a deep learning AI model to efficiently solve a fundamental mathematics problem, while beating a 50-year-old record to boot. In a blog post from earlier this month, the DeepMind team introduces AlphaTensor, an AI system designed to discover new and more efficient algorithms for crucial mathematical operations -- in this case, matrix multiplication. Whether it is used to process or compress images and video, recognize spoken commands, or run simulations to predict the weather, matrix multiplication underpins much of modern computing. So it's little wonder that experts and companies all over the world are constantly looking for more efficient algorithms for the mathematical operations behind such tasks. Matrix multiplication is one of the simplest operations in algebra: individual numbers arranged in grids -- or matrices -- are multiplied together and then added in a specific way to generate a new matrix.
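The definition described above can be written down directly. A minimal, self-contained sketch of textbook matrix multiplication, which uses one scalar multiplication per term of each sum (so $n \cdot m \cdot k$ multiplications in total):

```python
def matmul(A, B):
    """Textbook matrix multiplication: entry (i, j) of the result is the
    sum of products of row i of A with column j of B."""
    n, k, m = len(A), len(B), len(B[0])
    assert all(len(row) == k for row in A)  # inner dimensions must agree
    C = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                C[i][j] += A[i][p] * B[p][j]  # one scalar multiplication each
    return C

# Two 2x2 matrices: the definition uses 2*2*2 = 8 scalar multiplications.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # → [[19, 22], [43, 50]]
```

It is exactly this count of scalar multiplications that faster schemes reduce: Strassen's 1969 algorithm needs only 7 for the 2x2 case, and AlphaTensor searches for further savings of the same kind.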


AlphaTensor and Its Implications for AI, Reinforcement Learning, and Science - DataScienceCentral.com

#artificialintelligence

Last week DeepMind announced AlphaTensor, a mathematics-focused system that follows AlphaFold and continues the tradition of using AI to expand the horizons of science. In this case, the problem is matrix multiplication, known to most people from their high school days. The issue is not the multiplication itself but the fastest method to perform it. Speeding up matrix multiplication has a high impact because it is part of many applications -- especially in deep learning and image processing. AlphaTensor models matrix multiplication problems as games and trains its agent using reinforcement learning.
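To make concrete what kind of algorithm the AlphaTensor agent is searching for, here is Strassen's classical 2x2 scheme, the best-known example of trading 8 scalar multiplications for 7 (a sketch for illustration; AlphaTensor discovered schemes for other sizes and arithmetic settings):

```python
def strassen_2x2(A, B):
    """Strassen's scheme: multiply two 2x2 matrices with 7 scalar
    multiplications instead of 8 -- the kind of low-rank decomposition
    AlphaTensor's agent searches for by playing a tensor game."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    # The 7 products combine with only additions and subtractions.
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

A, B = [[1, 2], [3, 4]], [[5, 6], [7, 8]]
print(strassen_2x2(A, B))  # → [[19, 22], [43, 50]]
```

Applied recursively to blocks, saving one multiplication per 2x2 step lowers the asymptotic cost below $n^3$, which is why even small per-step savings matter.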